This is the README for the project "Comparison of LLE and PCA on Local Features for Lung Disease Image Retrieval".
Open CBIRloadModel_lle_last_minute.ipynb to view the outputs of the LLE version of the project, and CBIRloadModel_pca_last_minuteVer2.ipynb to view the outputs of the PCA version.
While these two notebooks are sufficient to run the project and view the final results, the steps below describe how the machine learning models and the CSV files of features are generated.
We took the dataset from Kaggle. It is provided by the NIH and contains images of different lung diseases, covering 14 disease classes. Each image is listed with its file name and disease class in a CSV file. To classify an image into one of the 14 classes, we trained a neural network on the provided data. The project has three main steps.
ORB stands for Oriented FAST and Rotated BRIEF; it is an alternative to SIFT and SURF features. ORB uses FAST (Features from Accelerated Segment Test) to detect keypoints and BRIEF to compute the descriptors. ORB features were computed for each image, producing one vector per image. The number of elements in the vector cannot be predetermined and depends on the image, so the vectors range from few to many elements. The features of each image were appended to a matrix that holds the features of all images.
Local binary pattern (LBP) features are very effective at capturing local texture in images. They are calculated using the following method. a. For each pixel, consider its eight neighbours. b. When the centre pixel is greater than a neighbour, append a one to the vector. c. When the centre pixel is less than or equal to a neighbour, append a zero. d. After iterating through all eight neighbours, we have a binary vector of size 8. e. Steps b, c and d are executed for every pixel in the image. f. As the vector is an 8-bit binary code, the LBP value ranges from 0 to 255. g. To reduce the number of descriptors without losing rotation invariance, we use a method called uniform local binary patterns. h. If a binary pattern has 2 or fewer circular transitions, we call it uniform; otherwise it is non-uniform. i. There are 58 uniform binary patterns, and all non-uniform patterns are combined into a single bin, so we effectively have a 59-element histogram instead of a 256-element one. j. The LBP descriptor is a histogram representing the frequency of each binary pattern.
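The steps above can be implemented directly in NumPy. This is a minimal, unoptimized sketch following the comparison rule stated above (bit = 1 when the centre pixel is greater than the neighbour); a real project would more likely use skimage.feature.local_binary_pattern.

```python
import numpy as np

def uniform_lbp_histogram(img):
    """59-bin uniform LBP histogram of a 2-D grayscale image."""
    def transitions(code):
        bits = [(code >> i) & 1 for i in range(8)]
        return sum(bits[i] != bits[(i + 1) % 8] for i in range(8))

    # 58 uniform patterns get their own bin; everything else shares bin 58
    uniform = [c for c in range(256) if transitions(c) <= 2]
    bin_of = {c: i for i, c in enumerate(uniform)}

    # eight neighbours, traversed clockwise from the top-left
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = img.shape
    hist = np.zeros(59, dtype=int)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            code = 0
            for i, (dy, dx) in enumerate(offs):
                if img[y, x] > img[y + dy, x + dx]:  # centre greater -> 1
                    code |= 1 << i
            hist[bin_of.get(code, 58)] += 1
    return hist

flat = np.full((10, 10), 7, dtype=np.uint8)
hist = uniform_lbp_histogram(flat)
```

On a flat image every pixel produces the all-zero (uniform) pattern, so all 64 interior pixels fall into a single bin.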
The image was divided into grids of different sizes: 2x2, 3x3 and 4x4, in addition to the whole image. LBP was calculated for each cell and the histograms were concatenated into a single vector, giving 59 x (16 + 9 + 4 + 1) = 1770 descriptors.
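A sketch of the grid split and concatenation; a simple intensity histogram stands in for the 59-bin LBP histogram so the example is self-contained.

```python
import numpy as np

def concat_grid_features(img, feature_fn, grids=(1, 2, 3, 4)):
    """Split img into n x n grids and concatenate per-cell feature vectors."""
    parts = []
    for n in grids:
        row_blocks = np.array_split(np.arange(img.shape[0]), n)
        col_blocks = np.array_split(np.arange(img.shape[1]), n)
        for rb in row_blocks:
            for cb in col_blocks:
                cell = img[rb[0]:rb[-1] + 1, cb[0]:cb[-1] + 1]
                parts.append(feature_fn(cell))
    return np.concatenate(parts)

# stand-in for the 59-bin uniform LBP histogram of one cell
dummy_hist = lambda cell: np.histogram(cell, bins=59, range=(0, 256))[0]
vec = concat_grid_features(np.zeros((64, 64)), dummy_hist)
```

The whole image plus the 2x2, 3x3 and 4x4 grids give 1 + 4 + 9 + 16 = 30 cells, so the concatenated vector has 59 x 30 = 1770 elements, matching the count above.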
Principal component analysis (PCA) converts the descriptors into linearly uncorrelated variables by means of an orthogonal transformation. PCA was applied to the ORB features and the LBP features separately, with the output size fixed at 500 components.
Locally linear embedding (LLE): LLE solves globally non-linear problems using locally linear fitting. LLE was applied to the ORB and LBP features separately, with the output size fixed at 500 components.
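Assuming scikit-learn's LocallyLinearEmbedding was used, the step can be sketched as follows (small sizes shown; the project used 500 components).

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))  # stand-in feature matrix

# each point is reconstructed from its n_neighbors nearest neighbours,
# and a low-dimensional embedding preserving those weights is solved for
lle = LocallyLinearEmbedding(n_components=2, n_neighbors=10)
Z = lle.fit_transform(X)
```

Note that n_neighbors must exceed n_components, so reducing to 500 components requires more than 500 neighbours per point.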
The dimensionality-reduced LBP and ORB features are each of size 500. The two are concatenated into a 1000-element vector, which is used to train the neural network that classifies the disease.
To classify the output into one of the 14 classes, an artificial neural network is trained on the dimension-reduced feature set. The network has 1 input layer, 6 hidden layers and 1 output layer. The input layer contains 1000 neurons, matching the size of the feature set. Each hidden layer contains roughly half as many neurons as the previous layer, so the hidden layer sizes are 500, 250, 125, 63, 32 and 16. The output layer has 14 neurons, one per class. The optimizer used was Adam, with categorical cross-entropy loss and the categorical accuracy metric. The model was fit on the training data obtained from the previous steps and is used to predict the disease given an image.
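A NumPy sketch of the architecture's forward pass. The ReLU hidden activations and softmax output are assumptions; the text only specifies the layer sizes, Adam, and categorical cross-entropy (which implies a probability output over the 14 classes).

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [1000, 500, 250, 125, 63, 32, 16, 14]    # input, hidden..., output
params = [(rng.standard_normal((m, n)) * 0.01, np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

def predict_proba(x):
    """Forward pass from a 1000-dim feature vector to 14 class probabilities."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.maximum(x, 0)                 # ReLU on hidden layers (assumed)
    e = np.exp(x - x.max())                      # numerically stable softmax
    return e / e.sum()

probs = predict_proba(rng.standard_normal(1000))
```

Training (Adam, categorical cross-entropy) would be done with a deep learning framework; the sketch only shows how the stated layer sizes fit together.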
First, we use the model to predict, for each image, the probability of belonging to each class; these probabilities range between 0 and 1 and are stored on disk for every image. To retrieve images related to a query image, we compute the ORB and LBP features of the new image, apply dimensionality reduction, and concatenate them. This vector is passed as input to the trained model to obtain the query's output probabilities.
We take the query's highest class probability and compute its difference from the stored probability of that class for every database image. The images with the smallest difference are returned as the matches to the query image.
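The retrieval rule above can be sketched as follows; the function name and the cutoff k are illustrative.

```python
import numpy as np

def retrieve(query_probs, db_probs, k=9):
    """Rank database images by closeness to the query's top class probability."""
    top = int(np.argmax(query_probs))                  # query's strongest class
    diff = np.abs(db_probs[:, top] - query_probs[top])
    return np.argsort(diff)[:k]                        # k smallest differences

query = np.array([0.1, 0.7, 0.2])
db = np.array([[0.2, 0.6, 0.2],
               [0.8, 0.1, 0.1],
               [0.1, 0.7, 0.2]])
ranked = retrieve(query, db, k=3)  # image 2 matches exactly, image 1 is furthest
```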
Precision = tp / (tp + fp)
Recall = tp / (tp + fn)
where tp = true positives, fp = false positives and fn = false negatives.
Using PCA:
Precision for image 1 = 4/9
Recall for image 1 = 4/1582
Using LLE:
Precision for image 1 = 5/9
Recall for image 1 = 5/1582
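The reported numbers follow directly from the definitions above: 9 images are retrieved per query, and 1582 images in the collection are relevant to image 1.

```python
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

# PCA: 4 of the 9 retrieved images are relevant, so fp = 5 and fn = 1582 - 4
pca_p, pca_r = precision(4, 5), recall(4, 1578)
# LLE: 5 of the 9 retrieved images are relevant, so fp = 4 and fn = 1582 - 5
lle_p, lle_r = precision(5, 4), recall(5, 1577)
```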
Using LLE as the dimensionality reduction technique gave better precision and recall than using PCA.